The NeoSpan project develops a comprehensive, dedicated pipeline for retrieving neoantigens (i.e. tumor-specific antigens that can be recognized by the immune system). The pipeline accounts for the spatial complexity of tumor microenvironments and enables reliable identification of immunogenic targets.

Neoantigens hold great potential as targets for personalized cancer vaccines and immunotherapies, especially when their spatial context is considered, since it reveals patterns of immune accessibility and immune microenvironment dynamics that are otherwise obscured in bulk analyses. To this end, a pipeline that integrates multiple data modalities (e.g. genomics, transcriptomics, and spatial coordinates) with mutation and neoantigen prediction enables the accurate identification of neoantigens and the assessment of their immunogenic potential.

This project aims to catalyze the design and implementation of an open-source pipeline prototype specifically for retrieving and spatially mapping neoantigens within the context of spatial omics data.

The pipeline can be divided into four primary categories:

  1. Mutation Calling
  2. Neoantigen Prediction
  3. Spatial mapping of neoantigen-associated gene mutations
  4. Neoantigen evaluation using spatial and non-spatial statistics

These categories are discussed below.

Mutation Calling

When investigating neoantigens, only mutations that are transcribed and translated into proteins matter, since only those can become neoantigens. Therefore, mutation calling from spatial RNA-seq data highlights the expressed mutations (translation potential) and pinpoints their location in the tissue (e.g. tumor edge, immune region). Furthermore, by definition, neoantigen-forming mutations must appear only in tumor cells. To this end, for mutation calling we use a modified version of the SComatic tool that, in addition to the spatial RNA-seq data, incorporates region annotation information (e.g. tumor vs normal). Overall, its advantages as a mutation calling technique for neoantigen prediction include:

  1. the partial consideration of the spatial location of the mutations by inspecting spot clusters separately
  2. the reporting of only high-quality mutations using hard filters and statistical tests
  3. the reduction of germline mutation detection through the use of matched normal tissue based on pathology annotation.

The methodology is displayed in the following figure.

We note here that this method is useful for identifying active mutations for neoantigen prediction, but it is not efficient for discovering mutations in general, since spatial RNA-seq datasets are characterized by low coverage and by low sensitivity and specificity for mutation detection. Moreover, they only detect mutations in transcribed genes, and only at sites with sufficient expression.
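The tumor-versus-normal idea described above can be illustrated with a minimal sketch. This is not SComatic's actual algorithm or API: the `SiteCounts` record, the `tumor_specific_sites` helper, and all thresholds are hypothetical and stand in for the hard filters, while the statistical tests are omitted entirely. The sketch keeps a candidate site only if it has enough supporting reads in tumor-annotated spots and no alternative reads in sufficiently covered normal-annotated spots.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteCounts:
    """Read counts for one candidate site in one annotated region (hypothetical layout)."""
    chrom: str
    pos: int
    alt: str
    region: str       # "tumor" or "normal", taken from the pathology annotation
    alt_reads: int    # reads supporting the alternative allele
    total_reads: int  # all reads covering the site


def tumor_specific_sites(counts, min_alt_reads=3, min_depth=5, max_normal_alt=0):
    """Return sorted (chrom, pos, alt) keys supported in tumor and absent in normal.

    All thresholds are illustrative assumptions, not values from the pipeline.
    """
    tumor_ok = set()
    normal_seen = {}  # key -> (alt_reads, total_reads) summed over normal regions
    for c in counts:
        key = (c.chrom, c.pos, c.alt)
        if c.region == "tumor":
            if c.alt_reads >= min_alt_reads and c.total_reads >= min_depth:
                tumor_ok.add(key)
        elif c.region == "normal":
            alt, total = normal_seen.get(key, (0, 0))
            normal_seen[key] = (alt + c.alt_reads, total + c.total_reads)

    kept = []
    for key in tumor_ok:
        alt, total = normal_seen.get(key, (0, 0))
        # Require normal coverage so that the absence of alt reads is informative.
        if total >= min_depth and alt <= max_normal_alt:
            kept.append(key)
    return sorted(kept)


# Toy counts, invented for illustration only.
example_counts = [
    SiteCounts("chr1", 100, "A", "tumor", 6, 20),
    SiteCounts("chr1", 100, "A", "normal", 0, 15),   # clean in normal: kept
    SiteCounts("chr2", 200, "T", "tumor", 8, 25),
    SiteCounts("chr2", 200, "T", "normal", 7, 18),   # also in normal: germline-like
    SiteCounts("chr3", 300, "G", "tumor", 1, 30),    # too few supporting reads
    SiteCounts("chr3", 300, "G", "normal", 0, 12),
    SiteCounts("chr4", 400, "C", "tumor", 5, 10),
    SiteCounts("chr4", 400, "C", "normal", 0, 2),    # normal coverage too low
]
print(tumor_specific_sites(example_counts))
```

Requiring a minimum depth in the normal region reflects the low-coverage caveat: with only a couple of normal reads, seeing no alternative allele says little about whether the variant is germline.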

The output of the mutation calling (a VCF file) is then used as input for the neoantigen prediction.

Neoantigen Prediction

To explore MHC-binding neoantigen candidates, we use the pVACseq tool. This additionally requires:

  1. VEP annotation to extract affected amino acid sequences
  2. HLA typing

Regarding the HLA typing, there are several possibilities based on the data/information available in our sample.

  1. If DNA information (WGS or WES) is available, then accurate HLA typing is possible. In that case, we recommend using NetMHCpan or, equivalently for end-to-end analysis, pVACtools.
  2. If complete HLA gene coverage is not ensured, the following options are available:

As soon as we have the VEP annotation and HLA typing, we can run pVACseq and extract the “all_epitopes.aggregated.tsv” file.
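As a hedged illustration of how the aggregated report could be consumed downstream, the sketch below reads a tab-separated table and keeps rows whose mutant-peptide binding affinity falls under a cutoff. The column names (`Gene`, `AA Change`, `Best Peptide`, `IC50 MT`) are assumptions about the report layout, which may differ between pVACtools versions, and the 500 nM cutoff is a commonly used convention rather than a value stated for this pipeline. The helper `strong_binders` and the toy rows are hypothetical.

```python
import csv
import io


def strong_binders(tsv_text, ic50_column="IC50 MT", max_ic50=500.0):
    """Return rows whose mutant-peptide IC50 (nM) is at or below max_ic50.

    The column name and cutoff are assumptions; adjust them to the actual report.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    if reader.fieldnames is None or ic50_column not in reader.fieldnames:
        raise ValueError(f"column {ic50_column!r} not found in report header")
    kept = []
    for row in reader:
        try:
            ic50 = float(row[ic50_column])
        except (TypeError, ValueError):
            continue  # skip rows with a missing or non-numeric value (e.g. "NA")
        if ic50 <= max_ic50:
            kept.append(row)
    return kept


# Toy table with invented values, for illustration only.
example_report = (
    "Gene\tAA Change\tBest Peptide\tIC50 MT\n"
    "GENE_A\tG12D\tPEPTIDEAA\t120.5\n"
    "GENE_B\tR175H\tPEPTIDEBB\t2300.0\n"
    "GENE_C\tV600E\tPEPTIDECC\tNA\n"
)
candidates = [(r["Gene"], r["AA Change"]) for r in strong_binders(example_report)]
print(candidates)
```

With a real report, the text would be read from the file on disk and the resulting gene/mutation pairs passed on to the spatial mapping step.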

Spatial mapping of neoantigen-associated gene mutations

After obtaining the neoantigen prediction results, the original 10x Visium BAM file is inspected to determine whether each spot displays the relevant gene mutations. Spots carrying the respective mutations are labeled neoantigen-positive; all other spots are labeled neoantigen-negative. Only mutations of high quality (Phred > 30) are accepted.
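The labeling rule can be sketched as follows, under stated assumptions. The input is a list of `(barcode, base, phred)` observations at one mutated position; in practice these would be extracted from the BAM file (for 10x data the spot barcode is typically stored in the `CB` tag), which is not shown here. The helper `label_spots` is hypothetical, and the Phred threshold is interpreted as a per-base quality, which is an assumption about the text above.

```python
def label_spots(observations, all_barcodes, alt_base, min_phred=30):
    """Label each spot barcode as neoantigen 'positive' or 'negative'.

    observations: iterable of (barcode, base, phred) tuples covering the
    mutated position. A spot is positive if at least one read shows the
    alternative base with quality strictly above min_phred.
    """
    positive = set()
    for barcode, base, phred in observations:
        if base == alt_base and phred > min_phred:
            positive.add(barcode)
    return {bc: ("positive" if bc in positive else "negative") for bc in all_barcodes}


# Toy observations, invented for illustration only.
example_obs = [
    ("AAAC-1", "T", 37),  # alt base, high quality: positive
    ("AAAC-1", "C", 38),  # reference read from the same spot
    ("AAGT-1", "T", 30),  # alt base, but quality is not above 30
    ("ACCA-1", "C", 40),  # reference base only
]
example_barcodes = ["AAAC-1", "AAGT-1", "ACCA-1", "ACGT-1"]
print(label_spots(example_obs, example_barcodes, alt_base="T"))
```

Note that a spot with no reads at the site (here `ACGT-1`) is labeled negative, exactly like a spot with reference-only reads. Given the low coverage of spatial RNA-seq, a negative label therefore means "no evidence of the mutation" rather than "mutation absent".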

Neoantigen evaluation using spatial and non-spatial statistics

The last category of NeoSpan concerns the use of spatial and non-spatial statistics to evaluate the neoantigens and their impact on gene expression. To simplify this step, the NeoSpan Dashboard has been developed. This dashboard aims to provide a guided and user-friendly tool that allows the spatial and non-spatial evaluation of the neoantigens and their impact on, among others, gene mutations.

More specifically, via this Dashboard, the user can:

  1. Perform all the necessary preprocessing steps (spot filtering, normalization, feature selection, dimensionality reduction, clustering)
  2. Spatially visualize the spot barcodes (with or without neoantigen status and/or region annotation)
  3. Apply deconvolution methods (with or without reference single-cell data)
  4. Perform differential expression analysis (DEA) between neoantigen-positive and neoantigen-negative spots
  5. Examine spatial autocorrelation of genes selected by the user
  6. Synthesize DEA evidence in case of multiple samples.

For the aforementioned functionalities, the user can create dynamic/interactive plots, which in turn simplify data-driven analysis and improve decision-making.
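To make the spatial autocorrelation functionality more concrete, the sketch below computes Moran's I, one common spatial autocorrelation statistic. The text above does not state which statistic the Dashboard uses, so this choice is an assumption, as are the helper name `morans_i` and the binary distance-based neighbor weights. With weights w_ij, values x_i, and mean m, the statistic is I = (N / sum of w_ij) * (sum of w_ij (x_i - m)(x_j - m)) / (sum of (x_i - m)^2).

```python
import math


def morans_i(coords, values, radius):
    """Moran's I with binary weights: w_ij = 1 if spots i and j are within radius."""
    n = len(values)
    if n != len(coords):
        raise ValueError("coords and values must have the same length")
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)
    if denom == 0:
        raise ValueError("values are constant; Moran's I is undefined")

    num = 0.0
    w_sum = 0.0
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(coords[i], coords[j]) <= radius:
                w_sum += 1.0
                num += dev[i] * dev[j]
    if w_sum == 0:
        raise ValueError("no neighboring spots within the given radius")
    return (n / w_sum) * (num / denom)


# Four spots on a line; expression values are invented for illustration.
line = [(0, 0), (1, 0), (2, 0), (3, 0)]
print(morans_i(line, [1, 1, 0, 0], radius=1.0))  # clustered pattern, positive I
print(morans_i(line, [1, 0, 1, 0], radius=1.0))  # alternating pattern, negative I
```

Positive values indicate that neighboring spots tend to have similar expression, while negative values indicate an alternating pattern. The quadratic loop is fine for a toy example, but a real Visium sample with thousands of spots would call for a sparse neighbor graph.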